Stream on the Sky: Outsourcing Access Control 
Enforcement for Stream Data to the Cloud 



Tien Tuan Anh Dinh, Anwitaman Datta 
School of Computer Engineering, Nanyang Technological University, Singapore 
{ttadinh,anwitaman} @ntu.edu.sg 



Abstract — There is an increasing trend for businesses to mi- 
grate their systems towards the cloud. Security concerns that 
arise when outsourcing data and computation to the cloud include 
data confidentiality and privacy. Given that a tremendous amount 
of data is being generated everyday from plethora of devices 
equipped with sensing capabilities, we focus on the problem 
of access controls over live streams of data based on triggers 
or sliding windows, which is a distinct and more challenging 
problem than access control over archival data. Specifically, we 
investigate secure mechanisms for outsourcing access control 
enforcement for stream data to the cloud. We devise a system 
that allows data owners to specify fine-grained policies associated 
with their data streams, then to encrypt the streams and relay 
them to the cloud for live processing and storage for future 
use. The access control policies are enforced by the cloud, 
without the latter learning about the data, while ensuring that 
unauthorized access is not feasible. To realize these ends, we 
employ a novel cryptographic primitive, namely proxy-based 
attribute-based encryption, which not only provides security 
but also allows the cloud to perform expensive computations 
on behalf of the users. Our approach is holistic, in that these 
controls are integrated with an XML based framework (XACML) 
for high-level management of policies. Experiments with our 
prototype demonstrate the feasibility of such mechanisms, and 
early evaluations suggest graceful scalability with increasing 
numbers of policies, data streams and users. 

I. Introduction 

An enormous amount of data is being generated contin- 
uously, by organizations and individuals carrying out their 
day-to-day activities, as well as by dedicated sensing and 
monitoring infrastructures. Examples include financial ser- 
vices for monitoring stock prices (6), sensor networks for 
meteorological, environmental [3|, battle field and traffic con- 
trol @, iflOl monitoring. As personal devices, especially those 
equipped with sensing capabilities, are enjoying an unprece- 
dented growth, they are also becoming a prominent source of 
stream data. Applications such as participatory sensing [22|, 
[28 1 and personal health monitoring 1 1 ] collect continuous data 
from sensors fitted in smart phones or in other mobile devices. 

The abundance of such data brings many new opportunities, 
such as real time decision making and resource management 
at various scales - from personal area or home networks to 
smart planet. Often, these applications assume sharing and 
mash-up of data from multiple sources, possibly created and 
owned by different stake holders. One important requirement 
is to equip data owners with adequate control for determining 
different granularity in which the sharing is done with various 
parties. Another requirement is the infrastructure over which 



such sharing can be done in a scalable manner. 

One can presume that the need for a scalable infrastruc- 
ture can be readily realized thanks to the advent of cloud 
computing. Businesses are moving their computing systems 
cloud-ward, availing themselves of the elastic resources, ease 
of management and good cost-benefit ttade-off ifTTl . |25|, 
lfT8l . Il38ll . While recent advances in the technologies have 
given users more control and better performance lfl3ll . [37 1, 
limited progress has been made towards security guarantees, 
particularly vis-a-vis sharing data with multiple other parties 
in possibly different granularity. In general, many argue IfTTl . 
Il35ll . [34 1 that security remains one of the biggest obstacles 
to be overcome before the potential of the cloud can be fully 
realized. 

This paper is an important step exploring the design space 
of outsourcing the enforcement of access control of live data 
streams to the cloud. Our goal is to design a system that 
supports fine-grained access control policies in which most of 
the expensive computations are done by the cloud. We assume 
that a data owner outsources its stream data to an untrusted 
cloud. The system must allow the data owner to specify fine- 
grained access control policies for the data stream. It must 
ensure security with respect to access control enforcement, i.e. 
no unauthorized access is allowed even if the cloud colludes 
with dishonest data users. Finally, the system must protect 
privacy of the outsourced data. 

Access control for stream data is more challenging than for 
traditional non-stream (archival) data. In archival databases, 
access is defined on views |36| which are constructed by 
querying the databases. Since views are static and can be 
pre-computed, this technique is not applicable to stream data 
because of the potentially infinite size and queries being 
continuous and are carried out over newly arrived data. Most 
critically, the nature of stream data demands support for 
more fine-grained policies, especially those involving temporal 
constraints [17|. Specifically, most policies can be categorized 
as trigger or sliding window policies: 

- Trigger: A user is given access to a data record when its 
content satisfy a certain condition. For example, in a personal 
health monitoring system [1 1, a user tracks his blood pressure, 
heart rate, blood sugar, etc. using a portable device at regular 
intervals. On the one hand, such data is crucial for doctors to 
monitor patient recovery and to make early diagnosis. On the 
other hand, the user is concerned with his privacy and would 
not wish to disclose all the measurements. As a result, he can 
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specify a policy dictating that certain doctors have access to 
his data only when his blood pressure exceeds a threshold. 

- Sliding window: A user is given summaries of the data 
over specific windows, as opposed to having access to raw 
data. For example, a financial company monitoring stock 
prices every 0.5 second can sell its raw data for a very high 
price. It can also offer more coarse-grained packages at lower 
prices which contain only the average stock prices over s- 
second interval (s > 0.5). This gives the data owner greater 
flexibility in managing and generating profit from its data. 

A naive design to realize these goals would be to encrypt 
the data using a symmetric encryption scheme, store it on 
the cloud and distribute the decryption keys to authorized 
users. Such a system [34], [27| relies on the data owner 
for access control enforcement, while the cloud acts only 
as a transit storage provider. Use of a more advanced en- 
cryption scheme, namely attribute-based encryption (ABE) 
scheme G3l . |[T4"1 . can help relieve the data owner from much 
of the key management tasks. However, ABE deals only with 
individual data objects and therefore can not naturally support 
sliding window policies. Additionally, ABE effectively pushes 
the access control management task towards the end-user 
who needs to check all the ciphertexts and discard that that 
will result in invalid decryption. Such a modus operandi is 
inefficient and does not scale for large systems with many 
different data streams. 

In this paper, we propose to use a proxy attribute-based 
encryption scheme E4ll for trigger policies, and extend it 
with support for sliding window policies. Out system ensures 
access control while also offloading expensive computations 
to the cloud. Furthermore, we decouple the task of policy 
management from security enforcement by using the XACML 
framework [4|. The cloud handles this task seamlessly, and as 
a result the end-user only receives and decrypts ciphertexts of 
his authorized data. Adding an explicit layer of management 
on top of ABE helps the system scale better with more data 
streams and more policies. In summary, our main contributes 
are as follows: 

1. We propose an extension to an ABE scheme which 
provides support for sliding window access control. 

2. We design a system allowing data owner to outsource 
stream data and access control enforcement to the cloud. The 
system supports both trigger and sliding window policies in 
a secure manner. It is also scalable, in the sense that most of 
the expensive operations are done by the cloud. We achieve 
both security and scalability by combining novel cryptographic 
primitives with the standard XACML framework for policy 
management. 

3. We implement a prototype of our system. Through 
preliminary evaluation, we find that the overhead is reasonable. 

The remaining of the paper is organized as follows. The next 




section presents the system and security model. Section III 



details the cryptographic protocols. It is followed by the 
design for policy management. We present our prototype and 
preliminary evaluation in Section [V] Related work is discussed 
in Section IVT1 before we conclude in Section [VTTI 



Fig. 1: The context: (1) negotiation phase: the owner and the 
user agree on an access control policy (2) outsourcing phase: 
the owner encrypts its data and forwards to the cloud (3) 
relaying phase: the cloud processes and forwards the data to 
the authorized users. 



II. System Model 

Our system consists of a number of data owners (or 
owners), one cloud provider and a number of data users 
(or users). As depicted in Figure [TJ a data owner generates 
a stream of data and sends it to the cloud. Several users 
interested in the data will retrieve it through the cloud. The 
owner and users agree on the access policy before-hand. In 
summary, the data is outsourced to the cloud where it will be 
stored, managed and distributed to a set of users. 

A. Data Model 

For simplicity, we assume that data generated by the owner 
are key-value tuples. A stream D is a sequence of (ki,Vi) 
for i — 0,1,2,.. where ki,Vi are integers and the value Vi 
belongs to a known, finite domain. In a participatory sensing 
application, the ki may represent a geographical location 
while v may represent a measurement of air pollution. In a 
time-series application, such as weather monitoring, ki may 
represent a time-stamp value while Vi a temperature reading. 

For a time-series data stream, i.e. D = 
((0, vq), (1, i>i), (2, V2), ••)> we define a data window, 
denoted as w(s,e), as a sequence of data values whose 
keys are in between a starting index s and an ending index 
e. That is: w(s,e) — ( | s < i < e). A non-overlapping 
sliding window — denoted as sw(a,/3) — is a sequence 
of non-overlapping data windows starting from index a, 
having size (3. More precisely, the i th sliding window is 
sw(a, f3)[i] = w(a * i,a * i + j3). 

B. Access Control Model. 

An access control policy is defined by the data that an 
authorized user can read. We consider two types of policies: 
trigger and sliding window. In the former, the user is granted 
access to a tuple (ki,Vi) (i.e. it can read the value Vi) if 
ki is equal to or exceeds a threshold 9. Denote r(9,op) as 
the trigger policy, where 9 is a threshold value and op is a 
comparison operator, then: 

t(9, op) = ( Vi I ki op 9 = true ) 

In a sliding window policy, an user can only access the 
averages of the data tuples that constitute a non-overlapping 
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sliding window. In other words, the user cannot read individual 
data tuples, but only the summaries. Denote n(a,/3) as the 
sliding window policy with a being the starting index and f3 
being the window size, we have: 

w(a,P) = (^\v in sw(a,j3)\i];i = 1,2,..) 

As an example, suppose a data owner is to share a weather 
database containing temperature readings collected at five- 
minute intervals starting from time 0. A sliding window policy 
may require that the user can only access the average temper- 
ature every 1 hour, starting from time 3. This corresponding 
to the policy 7r(3, 12), meaning that the user can only read the 
averages of the windows [3 — 14], [15 — 26], [27 — 38], etc. 

C. Trust Model 

We assume that data owners are honest. They outsource 
data streams to the cloud and wish to protect privacy of their 
data against the cloud. They define access policies for a set 
of users and wish to enforce the policies in a secure manner 
that does not permit unauthorized data access. The cloud is 
not trusted in that it tries to learn sensitive information in the 
data, and may collude with rogue users in order to provide 
them unauthorized access to the data. However, the cloud is 
trusted to carry out computations delegated to it by the owners 
and users. This means it does not skip or distort computations. 
A data user may be dishonest, in which case he attempts to 
gain unauthorized access, for which he may collude with the 
cloud. Finally, we assume that users do not collude with each 
other (we explain the reason for this assumption shortly). 

D. System Design Goals 

Given the models above, we now state informally the goals 
that our system aims to achieve. 

Goal 1 (privacy). The cloud cannot learn the outsourced 
data in plaintext. 

Goal 2 (access control, trigger). An user given access ac- 
cording to the policy r(6>, op) cannot access values belonging 
to another policy t(0', op') where 8 ^ 6' or op ^ op' . 

Goal 3 (access control, sliding window). An user given 
access according to the policy ir(a,/3) cannot access values 
belonging to any different policy. Specifically, the user cannot 
see the values belonging to 7r(a',/3') where a' < a or 
a' a mod (3. This means the user does not have access 
to data windows starting before a. He can access windows 
starting from later indices, but those indices must be a multiple 
of a, which is the same as skipping a number of windows. 
Moreover, the user cannot see the values defined by ir(a', f3') 
where /?' < (3 or j3' =^ mod (3. This means the user cannot 
access data at finer granularity, nor can it access windows 
whose sizes are not multiples of j3. For instance, if an user 
is given access to windows of size 5, he must not be able to 
access windows of size 3 or size 7. It is worth noticing that 
the assumption about users not colluding with each others 
is necessary to achieve this goal. If two users with 7r(a,/3) 
and Tr(a,(3') where /? ^ mod f3' collude, they can derive 
windows of size \/3 — f3'\ or even windows of size 1. 



Goal 4 (scalability). Most of the expensive computation 
should be off-loaded to the cloud. The protocols involving the 
data owners and users should be light-weight. 

III. The Protocols 

A. Overview 

Our protocols for both trigger and sliding window policies 
consist of 3 phases, as illustrated in Figure [T] First, the data 
owner and user negotiate an access control policy which can be 
either a trigger or sliding window policy (negotiation phase). 
Next, the data owner encrypts its data and sends it to the 
cloud (outsourcing phase). Upon receiving the ciphertexts, the 
cloud performs transformation on the ciphertexts and forwards 
the results to the authorized users (relaying phase). The users 
receive messages from the cloud and decrypt them to obtain 
the plaintext data. 

A naive approach would be for the data owner to generate 
one encryption key for each policy, share this key to the 
authorized users, and encrypt the data using this key. This 
does not scale for two reasons. First, the amount of duplicated 
data that must be encrypted grows linearly with the number 
of policies. Second, for a sliding window policy 7r(a,/3), 
the owner has to transform the data locally and update the 
encrypted version to the cloud. This process must be done for 
every combination of a and j3. 

In our protocol, there is only encrypted copy of the data 
stored at the cloud for the trigger policies, and up-to w 
copies for sliding window policies where w is the number 
of unique window sizes. To this end, we make use of three 
important cryptographic primitives: attribute-based encryption 
(ABE), proxy re-encryption (PRE) and additive homomorphic 
encryption (AHE). ABE is used to enforce threshold condi- 
tions specified in trigger policies, as well as the condition on 
the starting index in sliding window policies. PRE is used by 
the cloud to transform ABE ciphertexts to simpler ciphertexts 
that can be decrypted by users. The aims of this transformation 
is to relieve the users from carrying out expensive operations 
involving ABE. For sliding window policies, AHE enables 
the cloud to compute the encrypted aggregates of the data 
windows over the encrypted text without breaking data privacy. 

B. Cryptographic Primitives 

1) Attribute Based Encryption.: Attribute-Based Encryp- 
tion (ABE) schemes produce ciphertexts that can only be 
decrypted if a set of attributes satisfies certain conditions (or 
policies). Two types of ABE exists: key-policy ABE l23l 
(KP-ABE) and ciphertext-policy ABE |H (CP-ABE). In both 
cases, a policy is expressed as an access structure over a set of 
attributes. For example, if the policy is defined as the predicate 
T = ai A (a 2 V a 3 ), then T({ ai }) = T({a 2 ,a 3 }) = false, 
T({ai,a,2}) = T({ai,a,3}) = true. The plaintext can be 
recovered only if evaluating the policy over the given attributes 
returns true. 

In KP-ABE, the message m is encrypted with a set of 
attributes A and an access structure T is embedded in a decryp- 
tion key. In CP- ABE, T is embedded within the ciphertext and 
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the user is given a set of attribute A. In both cases, decryption 
will succeed and return m if T(A) — true. KP-ABE and CP- 
ABE are based on secret sharing schemes that generate shares 
in a way that the original secret will only be reconstructed if a 
certain access structure is satisfied. Constructions of KP-ABE 
and CP-ABE for any linear secret sharing scheme (modelled 
as a span program) are described in l23l . Ifl4ll . In practice, the 
schemes rely on threshold tree structures for secret sharing, in 
which non-leaf nodes are threshold gates (AND and OR). 

We adopt KP-ABE in our work, which is more suitable 
than CP- ABE, because the attributes are based on the plaintext 
message (the key value k). Although KP-ABE and CP-ABE 
are similar and in some cases can be used interchangeably, 
the former is more data-centric (concerning the question of 
who gets access to the given data), whereas the latter is more 
user-centric (concerning the question of which data the given 
user has access to). For the trigger policy t(9, op) we encrypt 
(ki,Vi) using KP-ABE with fc,; as the attribute and k op 9 = 
true as the policy. Similarly, for the sliding window policy 
7r(a, /3), we encrypt (ki, Uj) using ki as the attribute and k > a 
as the policy. 

Since the policies involve arithmetic comparison, we make 
use of the bag-of-bits method that translates an integer into a 
set of attributes. For example, with 64-bit integers, an attribute 
3 can be mapped to {x..xlx,x..xl}, and the policy k > 11 
can be mapped into (ge2exp4 V ge2exp8 V ge2expl6 V 
ge2expZ2)\l (x.Axxx A (x.Axx V (x..xlx A x..xxl))) where 
ge2expm is an attribute representing values greater than or 
equal to 2 m . It can be seen that a policy involving equality 
condition often involves more attributes than a policy with an 
inequality condition. 

2) Proxy Re-Encryption.: A Proxy Re-Encryption scheme 
(PRE) lfl5l enables a third-party to transform one ciphertext 
to another without learning the plaintext. Using a transfor- 
mation key Kij^v it can convert c = Enc(Kjj, m) to 
d = Enc(Kv, m) which can be decrypted with Ky, without 
learning m. Traditional PRE schemes have been used for 
distributed file systems, in a way that saves the data owner 
from having different copies of the same data for different 
users. In this work, we extend a recent proxy-based ABE 
scheme ll24l which can transform a KP-ABE ciphertext into 
a Elgamal-like ciphertext whenever the access structure is 
satisfied. This scheme has the benefit of traditional PRE 
schemes, and it relieves users from expensive computations 
involving ABE. 

3) Additive Homomorphic Encryption.: Sliding window 
policies require that authorized users have access to the 
averages of data values within non-overlapping windows. 
To allow for such an aggregate operation over ciphertexts, 
the encryption scheme must be additively homomorphic, i.e. 
Enc(if, mi) © Enc(-ftT, 7712) = Enc(K, mi + m%) for an 
operator ©. Paillier ll32ll is one of such schemes, which is 
based on composite residuosity classes of group Zjy^ where 
N is at least 1024-bit. 

In this work, we use an Elgamal-like encryption scheme to 
achieve additive homomorphism, which can also be integrated 
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Fig. 2: High-level protocol 



with ABE. Essentially, consider a multiplicative group Z p 
with generator g. For m G Z p , K and r as random values 
of Z p , then Enc(K,m) = (g T ', g m .g rK ). The homomor- 
phic property holds, because Enc(K, mi).Enc(A r , 1712) = 
Enc(if,mi + m-i). The drawback of this scheme is that 
decryption requires finding the discrete logarithm of g m in 
base g. As in other works using a similar scheme |20|, 
||26ll , [39], we assume that the plaintext domain is finite and 
small so that discrete logarithms can be computed by brute- 
force IT331 or even pre-computed and cached. For many types 
of applications where data values belong to a small set (such 
that temperature values), this is a reasonable assumption. Note 
that the security is not compromised due to the mapping of 
g m to m, which merely acts as an additional layer used to 
achieve homomorphism. 

To prevent a user from learning the individual values 
constituting a window, we introduce a blind factor into the 
ciphertext, which can only be removed when a proper number 
of ciphertexts are put together. Suppose the windows are of 
size 2, the owner generates <5i , 82 randomly and give a = 
Si + 8 2 to the user. The ciphertext c x = {g ri , g m .g Tl+5l+K ) 
and C2 = {g T2 , g 1712 -g r2+S2+K ) cannot be decrypted individ- 
ually, since 81 and 82 are unknown to the user. However, 

C1.C2 = (X = g ri+r2 ,Y = g^i+^2 g K(r 1 +r 2 +S 1 +S 2 )^ caQ be 

decrypted by computing j mi+ ™ 2 = xK Y „. K and recovering 
mi + m,2 by taking its discrete logarithm. In our protocol, 
the user only needs to know a for the starting window, with 
which the unblind factors for the subsequent windows can be 
efficiently computed. 

C. Protocol for Sliding Window Policies 

Figure [2] illustrates the high-level protocol for sliding win- 
dow policies. In the negotiation phase, data owner and user 
agrees on the sliding window parameters a, (3. The owner 
generates an access tree T for the condition k > a, a 
transformation key TK and a decryption key K for the user. It 
sends (a, /3, TK, K) to the user and (a, /3, TK) to the cloud. 
For every tuple (ki, vi), the owner encrypts Vi using KP-ABE 
with attribute Each tuple is encrypted uj times, each for 
a different window size j3 and using a different blind factor. 
During the relaying phase, the cloud selects ciphertexts whose 
attribute is greater or equal to a, and transforms them using 
TK into other ciphertexts that can be decrypted with another 
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key K' (K 7^ K'). Every /3 of the transformed ciphertexts are 
used as inputs to the ComputeSum function which produces 
an encrypted sum value which can be decrypted using K. 

Our protocol relies on the properties of bilinear maps for 
security. Let G, G\ be two multiplicative groups of prime order 
p, g be the generator of G. e : G x G — > G\ is a bilinear map 
if it is efficiently computable, e(u a , v b ) — e(u, v) aM for u, v £ 
G and e(g,g) 7^ 1. Given random group elements g a ,g b ,g c , 
it is difficult to compute e(g,g) abc (Computational Bilinear 
Diffie-Hellman Assumption). Also, it is difficult to distinguish 
e(g, g) abc from a random element e(g, g) z (Decisional Bilinear 
Diffie-Hellman Assumption). 

Negotiation phase. During setup, all parties agree on 
a security parameter A. Let U be the attribute universe, 
i.e. U = {x..xO, x..xl, x..0x, ...}. The owner chooses a 
master secret value y £ Z p , and other secret tx, t%, ..iim. 
For every window size (3, it generates a set of random 
values R = {ro, r\, .., rp-i}. The public key contains 
e{g,g)\g t \g t \..,g t ^. 

Let u be an user with a sliding policy 7r(a,/3). The owner 
generates a random value z u £ Z p . Next, the policy k > a is 
converted into an access tree whose leaves are the bag-of-bit 
representation of k, as explained in (23). Every non-leaf node 
is a threshold gate: an AND node is a 2-out-of-2 threshold 
gate, an OR node is a l-out-of-2 gate. This tree is used to 
share the master secret y. For a node x with threshold d x , 
choose a random polynomial q x of degree d x — 1 (so that 
Xd points are needed to reconstruct the polynomial). For the 
root node, set q r (0) — V- For any other node, set q x (0) = 
q P arent(x) {index (x)) where q parent ( x ) is the polynomial of 
cc's parent and index(x) is the node index. The leaf node x 

associated with an attribute i is given the value D x — g z ^ A i . 
For the policy w(a,f3), the owner computes: 

0-1 

a(a, P) = J2 2^+^ R[a + j mod 0\ 
3=0 

Finally, the user is sent 

{a,j3,{D x },(z u ,a(a,/3))) 

where {D x } and (z u , a(a, f3)) are the transformation and user 
decryption keys respectively. 

Outsourcing phase. To encrypt (k, v) € D, the data owner 
first translates k into bag-of-bit attributes B^. Next, it chooses 



a random value e Z p and computes E[ — g ti - Sk for all 
attribute i £ B^. It also computes: 

E(k, V ) = gV+^ k/n -R[* mod H\. e fa g y.»k 

Finally, the following ciphertext is forwarded to the cloud: 

c fe = {k,B k ,E(k,v),{El}) 

Relaying phase. The transformation proceeds recursively 
as follows. When a; is a leaf node associated with attribute i: 

Transform^) = / < D ^ E i) = e&g)**^ when i e B, 
_L otherwise 
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When a; is a non-leaf node, we call Transform(w) for all 
w that are x's children. Let Ai,s(x) = Ylj e $ l^f ^ e tne 
Lagrange coefficient for an element i £ 7L V and S C Z p . Let 
F w be the output of Transform(w), and S x be the set of x's 
children such that F w 7^ _L. If S x = 0, return _L. Otherwise: 

F x = Transform^) 

F w lndex<ro) - s x^ ^ w here S' x = {index(w) \ w £ S x } 

= e(3,5) Sfe * (0)/2u 
By calling Transform on the root node of the access tree, 
we obtain Trans form(rootk) = e(g, g) v ' Sk ^ Zu . Let — 
(E(k, v), Transf orm(rootfe)). For the i th window, the cloud 
has to gather the full set of C = {Ck, Ck+i, ••, Cfc+/3-i} for 
k = a + i./3. Then, it executes ComputeSum as follows: 

/3-1 

(E 1 ,E 2 ) = ComputeSum(C) = ]^[ C k +j 

j=o 

13-1 p-1 

= E ( k +j> v k+j), Iransform(roo^ +3 )) 

and sends (i = ^^,£'1,^2) to the user. 

Having received this message from the cloud, the user 
computes: 

_ Fh_ _ E(k,v).E(k + l,v 1 )..E(k + p-l,vp- 1 ) 
Wi ~ E^ ' (e(g,g)v- s k/**..e(g,g)y- s *+P-i/^) z " 

g(v k + ..+v k+fl _ 1 )+2\a(a,P) _ & lg^ g\ Ofc + - -+s k +ft- 1 )-y/zu 

(e(g, g)( s k+--+sk+/3-i)-y/z u y u 

_ g(v k + ..+v k + p- 1 )+2 i .a(a,0) 

Finally, the average is computed as: 

discreteLog( oi w j a , ) „, , ,., 



avg, 



P 



D. Protocol for Trigger Policy 

The protocol for trigger policies is very similar to that for 
sliding window. Its main differences are that the access policies 
may involve other conditions besides > as in sliding window, 
and that no blind factor is needed since authorized users can 
access individual data tuples. 

During the negotiation phase, the owner create the pub- 
lic key containing e(g, g) v , g fl , g t2 , .., g*! 17 as before. Next, 
the access structure for the policy kop9 is constructed. 
The transformation key contains {D x } for all leaf node x, 
and the user decryption key is z u . During the outsourc- 
ing phase, a ciphertext sent to the cloud is generated in 
the same way as with sliding window policy, except for 
E(k,v) = g v .e(g, g) y Sk . During transformation, the cloud 
computes Trans form(root k ) = e(g, g) y ^ k ^ Zu as before 
and sends C k — (E(k,v),Transform(root k )) to the user, 
who decrypts it by taking the discrete logarithm of g v = 

E(k,v) 

k (Transformfroofj.)) s w 
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E. Discussions 

Security. The protocols described above provide data pri- 
vacy with respect to the cloud, because we use an encryption 
scheme which is an extension of the proxy-based ABE scheme 
proposed in E4l . As shown in ll24ll . the scheme is Replayable- 
CCA (or RCCA) secure. Our encryption is secure in the 
standard model (as opposed to security in random oracle 
model), since we use the small-universe construction of ABE. 

The access control property for trigger policies hold as a 
result of ABE. For sliding window policies, we now discuss 
an informal proof showing that unauthorized access is not 
possible. Particularly, we need to show that an user given 
access according to a policy 7r(a, (3) cannot decrypt values 
belonging to other policies ir(a' , (3') where a' < a, f3' < (3, 
a' 7^ a mod (3 or f3' ^ mod (3. Since the ciphertext is 
encrypted with the attribute k and the policy embedded in the 
transformation key requires k > a, the Transform operation 
will fail for k < a. Thus, the user is prevented from learning 
the values in ir(a',f3') for a' < a. From a(a,(3), one can 
infer the value a(a',/3) for a' > a and a' — a mod /3, 
since a(a',(3) — 2 a ^ a .a(a,/3). However, for a' ^ a mod j3, 
deriving a(a',/3') from a (a, (3) is not possible as it requires 
knowing the sum of a subset of R. Therefore, the user cannot 
recover the plaintext because the unblind factors are incorrect. 
This similarly holds when (3' < (3 or (3' ^ mod (3, because 
computing the blind factors a(a',/3') in these cases requires 
knowing individual values of R. 

Scalability. The most expensive cryptographic operations 
are exponentiations and pairings. The latter are done only by 
the cloud during transformation. The number of pairings is 
proportional to the size of the access tree. Notice that the 
access tree is likely to be smaller for a sliding window policy 
than for a trigger policy, as there is a smaller number of 
attributes involved in constructing k > x than in constructing 
k = x. Without the cloud, the user will have to perform these 
pairings themselves. Instead, in our protocol, the user only 
performs simple operations such as inversion, multiplication, 
and only one exponentiation. Having to find discrete logarithm 
during decryption is a potential bottleneck. But as discussed 
earlier, we assume the data value domain is known and finite, 
hence the user may pre-compute the discrete logarithm, thus 
reducing this problem into a simple table look-up. If m is the 
maximum value that v can take, the look-up table has the size 
of m for trigger policies, and b.m for sliding policies where 
b is the maximum window size. 

The data owner's computations are during the negotiation 
phase and the outsourcing phase. Public and master key are 
generated only once, hence the cost will amortize over time. 
Generation of transformation keys is done once per policy. 
Compared to data encryption during the outsourcing phase, 
this operation is much less frequent and can be considered 
as a constant. Thus to say, the overall setup cost is a one-off 
cost which amortizes over time. During the outsourcing phase, 
the owner performs one encryption per data value for trigger 
policies, and w encryptions for sliding window policies (w 



being the number of window sizes). The cost per encryption 
is roughly the same for both trigger and sliding policies, and 
is dominated by the cost of 2 + b exponentiations. 

There are overheads incurred when storing the ciphertexts 
in the cloud. First, ABE encryption introduces overheads that 
are proportional to the number of attributes per ciphertext. 
These overhead are constant in our protocol, because the 
number of attributes is fixed at b. Second, the data owner may 
wish to support sliding window policies of different window 
sizes, which entails storing multiple ciphertexts for the same 
data value at the cloud. However, when encrypting (k,v) for 
different values of (3 only E(k, v) will be different. As a result, 
the overhead is limited to iu.||G|| where ||G|| is the number 
of bits per group element. 

IV. Access Control Management 

Before starting Transform operation, the cloud checks 
if the attributes associated with the ciphertext satisfy the 
access structure associated with the policy. This is to avoid 
performing redundant transformations which return _L. As 
the number of streams and policies increase, management 
and matching of policies must be done in a systematic and 
scalable manner. To this end, we adopt a holistic approach 
that leverages XACML framework, allowing the data owner to 
specify access control policies in XML format. It also provides 
a unified framework for managing policies at the cloud, for 
defining access control requests and matching them against 
the policies. Traditionally, XACML is used for access control 
in trusted domains, such as internal systems or in trusted 
collaborations. Our work utilizes XACML solely for policy 
management and not for access control enforcement. The latter 
is instead realized by our cryptographic protocols. 

A. XACML 

XACML is an OASIS framework for specifying and enforc- 
ing access control |4|. It defines standards for writing policies, 
requests, matching of requests against policies and interpreting 
the response. Here, we briefly explain the main components 
of XACML, more details can be found in J4J. 

Requests in XACML are written in XML, which contain 
subject credentials and the system resources being accessed. 
Subjects and resources are specified in the Attribute elements 
included in the Subjects and Resources element respectively. 
Each policy in XACML contains a Target and a set of Rules. 
A Target element consists of a set of matching rules that 
must be met by the request before the rest of the policy 
can be evaluated. The XACML framework comprises two 
main components: a Policy Enforcement Point (PEP) and a 
Policy Decision Point (PDP). Requests are sent to PEP which 
translates them to canonical forms before sending to PDP. 
It interprets responses from PDP and sends the results back 
to the users. Policies are loaded into and managed by PDP. 
After receiving well-formed requests from PEP, it evaluates 
them against the loaded policies and sends the well-formed 
responses back to the PEP. 
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<Subjects> <Subject> 

<Subj ectMatch Matchld="urn: oasis : names : tc : xacml : 1 . : function: integer-less-than-or-equal"> 
<Attribute Value DataType="http : //www . w3 . org/2001/XMLSchema#integer ">9</AttributeValue> 
<SubjectAttributeDesignator AttributeId="pCloud: subject : key-id" 

DataType="http : //www . w3 . org/200 1 /XMLS chema# integer "/> 

</SubjectMatch> 
</Subject> </Subjects> 
<Resources> <Resource> 

<ResourceMatch Matchld="urn: oasis : names : tc : xacml : 1 . : function : integer-equal"> 

<Attribute Value DataType="http : //www . w3 . org/2001/XMLSchema#integer ">0</AttribnteValue> 
<ResourceAttributeDesignator AttributeId="pCloud: resource : streamld" 

DataType="http : //www . w3 . org/200 l/XMLSchema#integer "/> 

</ResourceHatch> 
</Resource> </Resources> 

Fig. 3: Policy for sliding window, a = 9 



<Subject> 

<Attribute AttributeId="pCloud : subj ect : key-id" 

DataType="http : //www. w3 . org/2001/XMLSchema#inte| 
<AttributeValue>3</AttributeValue></Attribute> 

</Subject> 
<Resource> 

<Attribute AttributeId="pClbud : resource : streamld" 

DataType="http : //www. w3 . org/2001/XMLSchema#inte| 
<AttributeValue>0</AttributeValuex/Attribute> 

</Resource> 



Fig. 4: Request, k = 3 



B. Using XACML With Our Protocols 

In traditional settings, resource owners write XACML poli- 
cies and upload them to a access control server. Users inter- 
ested in the resource write XACML requests and submit them 
to the server, where they are evaluated by the PDP. If the 
decisions are Permit, the users are granted access. 

In our setting, the cloud acts as the access control server and 
runs an XACML instance. Each data stream is represented as a 
resource. The data owner writes XACML policies representing 
the trigger or sliding window policies. For each policy, the 
cloud maintains a list of users to which the policy is given. The 
user does not have to write or submit XACML requests to the 
cloud. Instead, it is the owner who specifies XACML request 
for every ciphertext it sends to the cloud. Specifically, the 
request generated for the tuple (k, v) contains the value of k in 
the Subject element. Once received at the cloud, it is evaluated 
against the loaded policies. The result is a set of matching 
policies and a set of users to whom access to the value v 
should be granted. Figure [3] illustrates an XACML policy for 
sliding window policies where a = 9. Figure[4]shows a request 
constructed for the tuple (k, v) where k = 3. When evaluated, 
the PDP will return a Deny decision, and hence the cloud will 
not perform transformation for the ciphertext of this tuple. 

V. Prototype and Evaluation 

A. Prototype Implementation 

We implement a prototype^ demonstrating the feasibility 
of the proposed approach and to benchmark the various 
components under simple settings. We extend the KP-ABE 
implementation from 0, which uses PBC library [5| for 
pairing operations. Our extensions mainly concern sliding 
window policies, which include functions for the Transform 
operation and for computing the encrypted sums at the cloud. 
We use a Java implementation of XACML J8) for managing 
and matching of policies. The communication between data 
owners, the cloud and users is implemented in Java, using 
sockets and one-thread-per-connection concurrency model. 



The source code is available at https://code.google.eom/p/streamcloud/ 



B. Microbenchmark 

We set up simple experiments to investigate the costs of the 
main operations in our protocols. For each operation, we use 
latency as a metric for measuring computation cost, which is 
also representative of the overhead caused by the operation. 

We use the fastest pairing implementation for our experi- 
ments (type A [5 1), whose base field size is 512 bits. There is 
one data owner sending one stream of data with a certain rate. 
There is one data user, to whom the owner specifies either 
a trigger policy or a sliding policy (but not both). We fix the 
sliding window size to 5 and the maximum data value to 1000, 
which implies that the maximum sum of a window is 5000. 
For trigger policies, we set op to be equality comparison. Our 
experiments are run on 3 machines, two desktop machines 
running as the owner and the user, and one laptop running 
as the cloud. Each desktop has 2 2.66Ghz DuoCore CPUs 
and 4GB of RAM running Ubuntu Linux. The laptop has 
2.3Ghz Core i5 CPU, 4GB of RAM and run Snow Leopard. 
All machines are connected via university LAN network. 

Figure [5] depicts the costs of cryptographic operations. The 
setup cost for data owner is the highest, since it involves 
generating public parameters and transformation key. This cost 
dominates the setup time at the user. Recall that this is a one 
time cost, and hence does not affect run time performance 
or practicality of the approach. The cost for sliding window 
policies is slightly smaller than for trigger policies, because 
the transformation keys in the former are of smaller sizes (as 
discussed in the previous section). Another significant cost in 
the setup phase is the cost of pre-computing the look-up table 
for discrete logarithms of values in [0,5000]. The setup time 
at the cloud is much smaller, because it only needs to read and 
initialize the public and transformation keys from byte arrays. 

The cost per encryption is the same for both sliding window 
and trigger policies (as predicted in the previous section), 
which is around 179(ms). We believe this cost is reason- 
able, especially considering that data streams in reality often 
generate data in very long intervals (in the order of seconds 
or even minutes). The transformation cost at the cloud for 
trigger policies is orders of magnitude larger than for sliding 
window policies. This is because transformation for the policy 
k — 9 requires 64 pairings, whereas for policy k > a only 1 
pairing is needed for large values of k. The maximum latency 
is around 120(ms), which we also consider as reasonable. 
Decryption cost at the user is an order of magnitude smaller 
than transformation at the cloud, which further illustrates the 
benefit of outsourcing bulk of the computation to the cloud. 

In order to understand the overhead of encryption with 
respect to the user-perceived latency, we measure the elapsed 
time between decrypted values at the user. For trigger policies, 
we fix k to be the same as 9, so that the the user has access 
to all the data. Figure [6] shows the inter-arrival time at the 
user. There exists a lower-bound on this metric when we 
increase the sending rates (by decreasing the send intervals). 
This boundary is exactly at the cost per encryption. This is 
expected since encryption becomes the bottleneck at the data 
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(a) Data owner (b) Cloud 

Fig. 5: Time spent on cryptographic operations 



(c) Data user 



baseline (no overhead), simple 

sliding window ■ □ 
baselline (no overhead), sliding window 



Send interval (ms) 



Fig. 6: Overhead when increasing the sending rate 



1000 
# policies 



Fig. 7: XACML processing time 



owner for high sending rates. Finally, we investigate the cost 
of running the XACML framework. We upload a large number 
of policies and send matching requests to the cloud. The cost 
for policy matching is depicted in Figure [7] which is small and 
scale gracefully with more policies. It suggests the system's 
scalability for large numbers of policies, streams and users. 

VI. Related Work 

Existing works on access control for stream data assume 
trusted domains and focus on the specification and enforce- 
ment of access policies ifTTl . ifTBTl . l30l . They are different 
to our work which considers outsourcing access control to 
an untmsted environment. Similarly, previous works that use 
XACML for fine-grained access control have also focused on 
trusted domains [21 1. We use XACML only for policy manage- 
ment, and rely on encryption schemes for policy enforcement. 



When the cloud is untmsted, systems such as CryptDb P4l 
and LLP [26] proposed systems that guarantee data privacy 
while still enabling meaningful, database-related queries on 
ciphertexts. Our work also provides data privacy, i.e. the cloud 
cannot learn plaintext values, but it focuses on the question of 
who can access the data and what the access granularity is. Ac- 
cess control on untmsted environment have been investigated 
in 11271 . iPlTI . both of which deal with archival data. Our use of 
ABE allows for more flexible, fine-grained access control than 
in 11271 . The system proposed in [41] also employs KP-ABE 
and the cloud as a proxy, but it does not work with stream data, 
nor does it support sliding window policies. Furthermore, the 
cloud's main role in ATI is during key revocation requiring 
data re-encryption, and unlike in our work, the cloud does not 
alleviate the users from expensive computations. 

Attribute-Based Encryption schemes are being actively re- 
searched EH, 01], E3, El, (29). Our protocols can be 
viewed as an extension of the proxy -based ABE scheme |24|. 
Compared to the original work, we have built a system that 
support meaningful access control policies for stream data. We 
have additionally integrated XACML framework for access 
control management. Finally, we have provided a holistic 
prototype implementation and preliminary evaluation for the 
encryption scheme as well as for the system as a whole. 

VII. Conclusions and Future Work 

In this paper, we have proposed a system that allows secure 
and scalable outsourcing of access control to the cloud. Our 
system works with stream data for which access control 
is more complex than archival data. We have extended an 
Attribute-Based Encryption scheme to support trigger and 
sliding window policies. We have employed the cloud as a 
computational proxy which performs the expensive crypto- 
graphic operations on behalf of the user. In particular, the 
cloud transforms complex ABE ciphertexts into Elgamal- 
like ciphertexts that are decrypted only by authorized users. 
We have integrated the standard XACML framework to aid 
policy management. Thus, our approach for access control 
outsourcing is holistic and is able to scale for large numbers 
of data streams and policies. Preliminary evaluation using a 
prototype indicates not only the feasibility but also the benefits 
of access control outsourcing. 



9 



While we have presented a working system, showing a 
proof-of-concept that outsourcing access control is feasible 
and practical for stream data, there is much left for future 
work. First, our current prototype is not capable of dealing 
with large, concurrent workloads. Our immediate plan is to 
either improve our prototype using event driven architecture 
(SEDA ll40l ) as a replacement for our current one-thread- 
per-connection concurrency model, or to use existing high- 
throughput stream processing engines such as STREAM Q 
or Borealis [19| followed by experiments at larger scale (more 
streams, more policies, more users, more cloud servers) over 
a real cloud infrastructure. 

The current system can be extended to support multi- 
value streams by performing different encryptions for different 
values. We have not considered access revocation, which in our 
case requires the data owner to change the attribute set during 
encryption. We plan to investigate if existing revocable KP- 
ABE schemes [12| can be integrated in our work, especially if 
they allows revocation to be outsourced to an untrusted cloud. 
Other interesting extensions are to add support for policies 
with negative attributes [31], and for encryption with hidden 
attributes. The former allows for a wider range of access 
policies, whereas the latter provides attribute privacy which 
is necessary when encryption attributes are the actual data. 

Finally, we would like to relax the trust assumption for the 
cloud. Currently, the cloud performs all the computations del- 
egated to it in an honest manner. However, a malicious cloud 
which skips or distorts computations will greatly increase the 
cost of security while maintaining a good service quality. 
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